Abstract

The main theorem in Judge and Mittelhammer [Judge, G. G., and Mittelhammer, R. (2004),
A Semiparametric Basis for Combining Estimation Problems under Quadratic Loss; JASA, 99,
466, 479-487] stipulates that, in the context of nonzero correlation, a sufficient condition
for the Stein rule (SR)-type estimator to dominate the base estimator is that the dimension
$k$ should be at least 5. Thanks to some refined inequalities, this dominance result is proved
in its full generality, for a class of estimators which includes the SR estimator as a special
case. Namely, we prove that, for any member of the derived class, $k \geq 3$ is a sufficient
condition regardless of the correlation factor. We also relax the Gaussian condition on the
distribution of the base estimator, as we consider the family of elliptically contoured
variates. Finally, we waive the condition on the invertibility of the variance-covariance
matrix of the base and the competing estimators. Our theoretical findings are corroborated by
some simulation studies, and the proposed method is applied to the Cigarette dataset.
1 Introduction and statistical model

1.1 Introduction

The multiple regression model is a common statistical tool for investigating the relationship
between a response variable and several explanatory variables. One of the main issues in
regression analysis consists in estimating the regression coefficients. In particular, in the
context of a linear regression model, it is common to use the ordinary least squares estimator
(OLSE). Indeed, under the normality of the error term, the OLSE is known to be the maximum
likelihood estimator as well as the minimum variance unbiased estimator. However, in case some
prior information (from outside the sample) is available, the OLSE may not be optimal. For
instance, this prior information may be due to past statistical investigations, when these
investigations could have concluded that some regression coefficients are not statistically
significant. Another source of prior information may be the expertise in a certain field,
which establishes an association between the regressor variables. Such a situation arises in
economic theory where, for example, it is common to consider that the sum of the exponents in
a Cobb-Douglas production function (see Douglas and Cobb, 1928) is equal to one.

From the statistical inference point of view, it is important to incorporate the available
prior information in the estimation method in order to improve upon the OLSE. For instance,
if such prior information can be expressed in the form of exact linear restrictions binding
the regression coefficients, instead of using the OLSE, one can resort to a competing
estimator which is also known as the restricted least squares estimator (RLSE); it is known
that the RLSE dominates the OLSE in such cases. In the sequel, the OLSE will be referred to as
the base estimator while the RLSE will be referred to as the restricted estimator or the
competing estimator. Thus, in the case where some exact prior information is available, the
practitioners should use the restricted estimator in order to estimate the target parameter
while, if only the sample information is available, the base estimator is to be preferred.

Nevertheless, in some circumstances, the prior information is only nearly correct: we want to
incorporate additional information, but we are not completely sure about it. Such uncertainty
about the additional information may be induced by a change in the phenomenon underlying the
regression model. Another context is the one where the prior information comes from experts in
a field; there, the uncertainty reflects the imprecision in the experts' information or
judgements. In the case where the prior information is that, from the past statistical
investigations, some regression coefficients are not statistically significant, the
uncertainty may reflect the fact that a field specialist believes that the nonsignificant
explanatory variables are important.

In these cases, we have to choose how to incorporate uncertain prior information into the
inference procedure. Technically, in order to use both the sample and the uncertain prior, we
can combine the base estimator and the restricted estimator, and thus it is important to find
an optimal combination. In the context of the linear regression model, Judge and Mittelhammer
(2004) proposed a Stein-type estimator and derived a sufficient condition for the risk
dominance of the Stein-type estimator relative to a certain base estimator. However, the main
result in Judge and Mittelhammer (2004) [JM] has some limitations. First, the error term is
supposed to be normally distributed. Second, the variance-covariance matrix of the joint
distribution of the base estimator and the competing estimator is supposed to be invertible.
This last assumption excludes, for example, a case where the prior information is about the
non-significance of some regression coefficients. Third, the derived sufficient condition is
too restrictive in the sense that it excludes the case of a multiple regression model with
fewer than five regressors. Thus, the condition in JM (2004) is not applicable to the cases of
quadratic or cubic regression models. However, in many applications (see Ashton et al., 2008,
Fernandez-Juricic et al., 2003, among others), if a linear fit is not appropriate, a quadratic
or cubic regression proves to be a simple and adequate model. A further example is the
Cigarette dataset produced by the USA Federal Trade Commission, which can be found in
Mendenhall and Sincich (1992). For this data set, the method in JM (2004) is not applicable
since we have only three explanatory variables. In Section 4, we analyse this dataset and we
show that our method performs very well.

In this paper, we generalize in four ways the main result in JM (2004), which gives a
sufficient condition for the risk dominance of the Stein-type estimator relative to a certain
base estimator. First, we present a class of estimators which includes as a special case the
Stein rule-type estimator given in JM (2004). Second, we relax the condition on the dimension
of the parameter space. Third, we waive the condition on the invertibility of the
variance-covariance matrix of the base estimator and the competing estimator. Thus, the
proposed methodology works also in the case where the practitioners suspect some linear
restrictions binding the regression coefficients. Fourth, we extend the main result to the
case of a family of elliptically contoured distributions. To this end, recall that the normal
distribution is a member of the elliptically contoured distributions and, as explained in
Provost and Cheong (2000), many test statistics and optimality properties underlying Gaussian
random samples remain unchanged for elliptically contoured random samples. For further
discussions and advantages of elliptically contoured distributions, we refer for example to
Abdous et al. (2004), Liu et al. (2009) and references therein. Finally, the main key for
establishing our results consists in deriving some inequalities and bounds which are more
refined than those used in JM (2004).

The remainder of this paper is organized as follows. Section 1.2 presents the statistical
model which is given in JM (2004) as well as the highlights of our contributions. In
Section 2, we present a class of Stein rule-type estimators and their risk function. Section 3
gives the main results of this paper in the Gaussian case and, more generally, in the
elliptically contoured case. We also show, in Section 3, that the proposed method works in the
context where the variance-covariance matrix of the base estimator and the competing estimator
is singular. In Section 4, we present some simulation results for small sample sizes as well
as an analysis of a real data set. Section 5 gives some concluding remarks. For the
convenience of the reader, technical proofs are given in the Appendix.
1.2 Statistical model and main contributions

In this section, we recall the statistical model and the assumptions as well as some
preliminary results which are given in JM (2004). Thus, this section presents only the
model for which the error term is normally distributed. As mentioned in the Introduction,
this is a preliminary step as we show later that the result established under the normality
assumption holds also in the cases of elliptically contoured variables.

Following JM (2004), we consider the estimation problem of a $k$-dimensional location
parameter vector when one observes an $n$-dimensional sample vector $y$ such that
$y = X\beta + \varepsilon$, where $X$ is an $n \times k$ design matrix of rank $k$ and
$\varepsilon$ is an $n$-dimensional random vector such that $\mathrm{E}(\varepsilon) = 0$ and
$\mathrm{cov}(\varepsilon) = \sigma^2 I_n$. Further, as in the quoted paper, we consider the
scenario where there exists some uncertainty concerning the above statistical model, which
leads to uncertainty concerning the appropriate inference method. For more details about
these issues, we refer, for example, to JM (2004), Saleh (2006), Hossain et al. (2009),
Morris and Lysy (2012) among others.

In the case where the above statistical model is appropriate, it is natural to estimate
the target parameter $\beta$ by using the least-squares (LS) estimator
$\hat\beta = (X'X)^{-1}X'y$. Further, in the context of an alternative statistical model, one
can consider the competing estimator $\tilde\beta$, which is such that
$\mathrm{E}(\tilde\beta) = \beta + \gamma$, $\mathrm{cov}(\tilde\beta) = \Phi$,
$\mathrm{cov}(\hat\beta, \tilde\beta) = \Sigma$. Thus, as in JM (2004), the two estimators
$\hat\beta$ and $\tilde\beta$ are assumed to be correlated, and $\tilde\beta$ may be biased
with bias $\gamma$. In the context of uncertainty about which one of the two statistical
models is more appropriate, it is common to consider an estimator which combines the two
estimators in an optimal way. Originally, this type of method was introduced by James
and Stein (1961). Over the last 50 years, numerous papers have been written around the
topic, so that it would be impossible to summarize all of them. To give some closely related
references, we mention Bock (1975), Judge and Bock (1978), JM (2004), Saleh (2006),
Nkurunziza and Ahmed (2010), Nkurunziza (2011), and Tan (2015) and references therein.

In our paper, we extend the following: JM (2004) stipulate that (see their main theorem),
in the case of nonzero correlation between the base estimator $\hat\beta$ and the alternative
estimator $\tilde\beta$, a sufficient condition for the Stein rule (SR)-type estimator to
dominate the base estimator is $k \geq 5$.

In this paper, we extend this result in four ways. First, we construct a class of estimators
which includes as a special case the SR estimator given in JM (2004). Second, we prove
that, regardless of the presence of correlation, the condition $k \geq 3$ remains sufficient
for any estimator of the proposed class of SR estimators to dominate in mean squared error the
base estimator. The impact of this finding consists in the fact that, unlike the result in JM
(2004), the established method can be applied to the case where the number of regressors is
less than five as, for example, the case of a quadratic or a cubic regression model. Third, we
also generalize the method in JM (2004) to the case where the joint distribution of the base
estimator and the restricted estimator may be singular. This last result can be very useful
in the case where the statistician suspects some linear restrictions binding the regression
coefficients. This includes, for example, the case where the prior information from past
statistical investigations is that some regression coefficients are not statistically
significant, while the expert in the field of application believes that the corresponding
explanatory variables should be in the model. Fourth, we prove that the established results
hold if the normality assumption is replaced with that of elliptically contoured variates.
Technically, in order to derive our findings, we establish some inequalities which are more
refined than those in JM (2004). Finally, let us note that the simulation results are in
agreement with the above theoretical findings. More specifically, the simulations show that
the risk dominance of some SR estimators increases as the correlation increases.


2 A class of Stein rule estimators and the risk function 


2.1 A class of Stein rule-type estimators 


In this subsection, we present a class of Stein rule (SR)-type estimators which includes
as a special case the SR estimator in JM (2004). Recall that the results given in this
paper hold under a more general statistical model than that in JM (2004). More precisely,
the established results hold whenever the estimators $\hat\beta$ and $\tilde\beta$ follow
jointly an elliptically contoured distribution. First, suppose that the estimators
$\hat\beta$ and $\tilde\beta$ are jointly Gaussian. Thus, let
$$
\begin{pmatrix} \hat\beta - \beta \\ \tilde\beta - \beta \end{pmatrix}
\sim \mathcal{N}_{2k}\left(
\begin{pmatrix} 0 \\ \gamma \end{pmatrix},
\begin{pmatrix} \Lambda & \Sigma \\ \Sigma' & \Phi \end{pmatrix}
\right),
\tag{2.1}
$$

where, as in JM (2004), the matrices $\Lambda$, $\Phi$, $\Xi = \Lambda - \Sigma - \Sigma' + \Phi$
are assumed to be positive definite. This assumption will be waived in Subsection 3.2 to study
the case where the matrices $\Xi$ and $\Phi$ may be singular. Further, let $c$ be a real number
and let $h$ be a real-valued, measurable and square-integrable function (with respect to the
Gaussian measure). We consider the following class of SR estimators

$$
\hat\beta^{S}(h,c) = \hat\beta + c\, h(\hat\beta, \tilde\beta)\,(\hat\beta - \tilde\beta).
\tag{2.2}
$$


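Example 2.1.

1. For a given $m$-column vector $e$, let $\|e\|^2 = e'e = \mathrm{trace}(ee')$. If
$h(\hat\beta, \tilde\beta) = 1/\|\hat\beta - \tilde\beta\|^2$, for a known real number $c$,
from (2.2), we get the estimator

$$
\hat\beta^{S}(h,c) = \hat\beta + c\,\frac{\hat\beta - \tilde\beta}{\|\hat\beta - \tilde\beta\|^2},
\tag{2.3}
$$

that is the Stein rule (SR)-type estimator given in JM (2004).

2. Let $a = S^2\,\mathrm{trace}((X'X)^{-1}) - \mathrm{trace}(\hat\Sigma)$, where
$S^2 = (n-k)^{-1}\|y - X\hat\beta\|^2$ and $\hat\Sigma$ is an unbiased and/or a consistent
estimator for $\Sigma$. If $c = -a$ and $h(\hat\beta, \tilde\beta) = 1/\|\hat\beta - \tilde\beta\|^2$,
the estimator in (2.2) becomes the Semiparametric Stein-Like (SPSL) estimator given in JM (2004).
Namely, we have

$$
\hat\beta^{S}(h,-a) = \hat\beta - a\,\frac{\hat\beta - \tilde\beta}{\|\hat\beta - \tilde\beta\|^2}.
\tag{2.4}
$$

3. If $h = 0$, the estimator in (2.2) yields the base estimator $\hat\beta$.

4. If $c = -1$ and $h(\hat\beta, \tilde\beta) = 1$, we have $\hat\beta^{S}(1,-1) = \tilde\beta$.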

As an important point, in this paper, the random quantity $h(\hat\beta, \tilde\beta)$ is a
statistic in the sense that it can be computed whenever we have the observations. As for the
real value $c$, which is assumed to be known in (2.2), this is similar to that used in the
SR estimator in JM (2004). The impact of replacing $c$ by its corresponding consistent
estimator should be similar to that in JM (2004). In practice, the value of $c$ can be
obtained by using a re-sampling technique such as the bootstrap.
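As a minimal sketch of how the class in (2.2) can be evaluated numerically, the following Python snippet combines a base estimate and a competing estimate as base + c * h(base, competing) * (base - competing). The function names (`stein_class_estimator`, `h_sr`, `h_zero`, `h_one`) and the numerical values are hypothetical and chosen only for illustration; they are not taken from JM (2004).

```python
import numpy as np


def stein_class_estimator(beta_hat, beta_tilde, h, c):
    """Evaluate beta_hat + c * h(beta_hat, beta_tilde) * (beta_hat - beta_tilde)."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    beta_tilde = np.asarray(beta_tilde, dtype=float)
    return beta_hat + c * h(beta_hat, beta_tilde) * (beta_hat - beta_tilde)


def h_sr(x, y):
    """Weight 1 / ||x - y||^2, which gives the SR-type estimator."""
    d = x - y
    return 1.0 / float(d @ d)


def h_zero(x, y):
    """Weight identically 0: the class reduces to the base estimator."""
    return 0.0


def h_one(x, y):
    """Weight identically 1: with c = -1 the class returns the competing estimator."""
    return 1.0


# Illustrative (made-up) estimates for k = 3.
beta_hat = np.array([1.0, 2.0, 3.0])
beta_tilde = np.array([0.5, 2.5, 2.0])

base = stein_class_estimator(beta_hat, beta_tilde, h_zero, c=0.7)
competing = stein_class_estimator(beta_hat, beta_tilde, h_one, c=-1.0)
sr = stein_class_estimator(beta_hat, beta_tilde, h_sr, c=-0.3)
```

The constant `c = -0.3` is arbitrary here; in the SPSL estimator it would be replaced by a data-driven quantity.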
2.2 Risk function

The performance of the proposed class of estimators is studied under the quadratic loss
function. Thus, the quadratic risk function, also called the mean squared error (MSE), of the
class of estimators in (2.2) is

$$
\mathrm{MSE}\left(\hat\beta^{S}(h,c)\right)
= \mathrm{E}\left[\left(\hat\beta^{S}(h,c) - \beta\right)'\left(\hat\beta^{S}(h,c) - \beta\right)\right],
$$

then, by using (3.1) and (3.2), we have

$$
\mathrm{MSE}\left(\hat\beta^{S}(h,c)\right) = \mathrm{trace}(\Lambda) - 2c\,\eta(h) + c^{2}\,\omega(h),
\tag{2.5}
$$


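where

$$
\eta(h) = \mathrm{E}\left[h(\hat\beta, \tilde\beta)\,(\hat\beta - \beta)'(\tilde\beta - \hat\beta)\right],
\qquad
\omega(h) = \mathrm{E}\left[h^{2}(\hat\beta, \tilde\beta)\,(\hat\beta - \tilde\beta)'(\hat\beta - \tilde\beta)\right],
\tag{2.6}
$$

assuming that these expectations are defined. Thus, from (2.5), it is obvious that, for all
$c \in \left(\min\{0, 2\eta(h)/\omega(h)\}, \max\{0, 2\eta(h)/\omega(h)\}\right)$, we have

$$
\mathrm{MSE}\left(\hat\beta^{S}(h,c)\right) < \mathrm{MSE}(\hat\beta),
$$

provided that $\eta(h)$ and $\omega(h)$ exist. Further, from (2.5), one concludes that the
optimal choice of $c$ is $c^{*} = \eta(h)/\omega(h)$ and thus,

$$
\mathrm{MSE}\left(\hat\beta^{S}(h,c^{*})\right) = \mathrm{trace}(\Lambda) - \eta^{2}(h)/\omega(h).
\tag{2.7}
$$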
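The dominance interval and the optimal constant both follow from reading the MSE as a quadratic function of $c$. The short derivation below is a sketch that assumes $\omega(h) > 0$ and $\eta(h) \neq 0$, and writes $\Lambda$ for the variance-covariance matrix of the base estimator, so that $\mathrm{MSE}(\hat\beta) = \mathrm{trace}(\Lambda)$; the shorthand $\mathrm{MSE}(c)$ for the risk of the class member with constant $c$ is introduced here only for brevity.

```latex
\begin{align*}
\mathrm{MSE}(c) - \mathrm{trace}(\Lambda)
  &= c\,\bigl(c\,\omega(h) - 2\,\eta(h)\bigr) < 0
  \iff c \text{ lies strictly between } 0 \text{ and } 2\,\eta(h)/\omega(h), \\
\frac{d}{dc}\,\mathrm{MSE}(c)
  &= -2\,\eta(h) + 2c\,\omega(h) = 0
  \iff c^{*} = \eta(h)/\omega(h), \\
\mathrm{MSE}(c^{*})
  &= \mathrm{trace}(\Lambda) - 2\,\frac{\eta^{2}(h)}{\omega(h)} + \frac{\eta^{2}(h)}{\omega(h)}
   = \mathrm{trace}(\Lambda) - \frac{\eta^{2}(h)}{\omega(h)}.
\end{align*}
```

Since the second derivative $2\,\omega(h)$ is positive, $c^{*}$ is a minimizer, and the risk reduction $\eta^{2}(h)/\omega(h)$ is strictly positive whenever $\eta(h) \neq 0$.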

Remark 2.1. Assuming that $\eta(h)$ and $\omega(h)$ are defined, (2.7) implies that, for a
fixed value of $\omega(h)$, the MSE of the SR estimator decreases as $|\eta(h)|$ increases.
Further, for a fixed value of $\eta(h)$, the MSE of the SR estimator decreases as $\omega(h)$
decreases. The simulation results given in Section 4 are in agreement with this analysis. To
make this idea more precise, we first note that, for a given $h$, the values of $\omega(h)$
and $\eta(h)$ depend both on $\gamma$, the bias of the estimator $\tilde\beta$, and on the
covariance between the estimators $\hat\beta$ and $\tilde\beta$. In particular, the
simulation results show that the risk dominance of the SPSL increases as the correlation
increases. Also, the simulation results show that the risk dominance of the SPSL decreases
as the norm of the bias increases.

As mentioned above, the derivation of the MSE in (2.5) and (2.7) assumes the existence of
$\omega(h)$ and $\eta(h)$. Thus, it is important to derive the conditions under which these
expectations are defined. To this end, we require that the function $h$ satisfies the following
assumption.

Assumption ($\mathcal{A}_1$): The function $h$ is such that $\|x - y\|^{2}\,h(x,y)$ is bounded,
i.e.

$$
|h(x,y)| = O\left(\|x - y\|^{-2}\right).
$$

Remark 2.2. It should be noticed that the function $h$ which gives the SR estimator satisfies
the above assumption. Indeed, in this case, we have $\|x - y\|^{2}\,h(x,y) = 1$. Also, the
function $h = 0$ which gives the base estimator satisfies the above assumption. As another example

of a function h which satisfies the Assumption [Mi), one can take h{x,y) = -jpj 

for some p^2. 

Below, we prove that, under the above assumption, regardless of the presence of correlation,
the condition $k \geq 3$ remains sufficient for any estimator of the class in (2.2) to
dominate in mean square error the base estimator. In particular, since the SR estimator is a
member of the class of estimators in (2.2), the established result proves that, regardless
of the presence of correlation, the condition $k \geq 3$ remains sufficient for the SR
estimator to dominate in mean square error the base estimator. We also prove that this
conclusion holds if the normality assumption is replaced by that of elliptically contoured
variates.


3 Main results 

In this section, we present the main results of this paper. As an intermediate step, we
derive below three propositions and a theorem which play a central role in deriving the
main result. In summary, these results are useful in deriving a more refined inequality than
that used in JM (2004). In order to simplify the presentation of the main results, we define
some notations which will be used for the remainder of the paper. Let $U = (U_1', U_2')'$,
where $U_1 = \hat\beta - \beta$ and $U_2 = \tilde\beta - \beta$. From (2.1), we have

$$
U = \begin{pmatrix} U_1 \\ U_2 \end{pmatrix}
\sim \mathcal{N}_{2k}\left(
\begin{pmatrix} 0 \\ \gamma \end{pmatrix},
\begin{pmatrix} \Lambda & \Sigma \\ \Sigma' & \Phi \end{pmatrix}
\right).
\tag{3.1}
$$

From the Cholesky decomposition, let $P$ be a nonsingular matrix such that $\Xi = PP'$, and
let

$$
Z = P^{-1}(U_1 - U_2) \quad \text{and} \quad R = P'P.
\tag{3.2}
$$


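Further, let

$$
W = \begin{pmatrix} Z \\ U_1 \end{pmatrix}, \qquad
F = \begin{pmatrix} 0 & 0 \\ P & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & P' \\ P & 0 \end{pmatrix},
\tag{3.3}
$$

$$
\eta = \mathrm{E}\left[U_1'PZ / Z'RZ\right], \qquad
\eta^{*} = \mathrm{E}\left[|U_1'PZ| / Z'RZ\right], \qquad
\omega = \mathrm{E}\left[1 / Z'RZ\right].
\tag{3.4}
$$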
Proposition 3.1. Suppose that Assumption ($\mathcal{A}_1$) holds, then there exists $q_0 > 0$
such that

1. $|\eta(h)| \leq q_0\,\eta^{*}$, where $\eta^{*}$ is defined in (3.4);

2. $|\omega(h)| \leq q_0^{2}\,\omega$, where $\omega$ is defined in (3.4).

The proof of this proposition is given in the Appendix. From Proposition 3.1, it is clear
that, in order to prove that $\eta(h)$ and $\omega(h)$ are defined, it is sufficient to prove
that $\eta^{*} < \infty$ and $\omega < \infty$. Below, we establish a theorem which proves that
$\eta^{*} < \infty$ provided that $\omega < \infty$. To introduce some notations, let
$\mathbb{1}_{G}$ denote the indicator function of the event $G$, let $H$ be a $k \times k$
symmetric matrix, let $\lambda(H)$ denote an eigenvalue of $H$, and let
$\lambda_1(H), \lambda_2(H), \dots, \lambda_k(H)$ be respectively the first, the second, ...,
the $k$-th eigenvalue of $H$.

Proposition 3.2. Let $\alpha > 0$, let
$\psi_1 = \max\{|\lambda_1(B)|, |\lambda_2(B)|, \dots, |\lambda_{2k}(B)|\}$, where $B$ is
defined in (3.3), and let $W$ be the random vector in (3.3). We have

$$
\mathrm{E}\left\{\left(|U_1'PZ| / Z'RZ\right)\mathbb{1}_{\{\|W\|^{2} \leq \alpha\}}\right\}
\leq \alpha\,\psi_1\,\omega / 2.
\tag{3.5}
$$

The proof of this proposition is given in the Appendix.

Proposition 3.3. Let $\alpha > 0$, and let
$\psi_0 = \min\{\lambda_1(R), \lambda_2(R), \dots, \lambda_k(R)\}$. We have

$$
\mathrm{E}\left\{\left(|W'FW| / Z'RZ\right)\mathbb{1}_{\{\|W\|^{2} > \alpha\}}\right\}
\leq \psi_1\left[\mathrm{trace}(\Lambda) + k + \mu'\mu\right] / (\alpha\,\psi_0),
\tag{3.6}
$$

where $\mu = \mathrm{E}(Z)$.

The proof of this proposition is given in the Appendix. By combining Propositions 3.2
and 3.3, we establish the following theorem, which plays a central role in proving that $\eta$
exists whenever $0 < \omega < +\infty$.
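Theorem 3.4. Let $\alpha > 0$, then there exists $M(\alpha) > 0$ such that

$$
|\eta| \leq M(\alpha)\,\omega + \psi_1\left(\mathrm{trace}(\Lambda) + k + \mu'\mu\right) / (\alpha\,\psi_0),
\tag{3.7}
$$

where $\psi_1$ and $\psi_0$ are given in Propositions 3.2 and 3.3 respectively.

Proof. Since $R$ is a positive definite matrix, we have

$$
|\eta| \leq \mathrm{E}\left[|U_1'PZ| / (Z'RZ)\right]
= \mathrm{E}\left[\left|(Z', U_1')\,F\,(Z', U_1')'\right| / (Z'RZ)\right],
\tag{3.8}
$$

where $F$ is given in equation (3.3). Further, set $W = (Z', U_1')'$. We have

$$
\mathrm{E}\left[\frac{|W'FW|}{Z'RZ}\right]
= \mathrm{E}\left[\frac{|W'FW|}{Z'RZ}\,\mathbb{1}_{\{\|W\|^{2} \leq \alpha\}}\right]
+ \mathrm{E}\left[\frac{|W'FW|}{Z'RZ}\,\mathbb{1}_{\{\|W\|^{2} > \alpha\}}\right].
$$

Then, by combining Propositions 3.2 and 3.3 and by taking $M(\alpha) = \alpha\,\psi_1/2$, we
get the stated result. □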
By using Theorem 3.4, we establish the following corollary, which shows that $\eta^{*}$ (and
so $|\eta|$) is bounded by a positive real number which is finite provided that
$0 < \omega < +\infty$.

Corollary 3.5. Suppose that the conditions of Theorem 3.4 hold. Then,

$$
|\eta| \leq \omega + \psi_1^{2}\left(\mathrm{trace}(\Lambda) + k + \mu'\mu\right) / (2\,\psi_0),
$$

where $\psi_1$ and $\psi_0$ are given in Propositions 3.2 and 3.3 respectively.

Proof. For $\alpha = 2/\psi_1$, we have $M(\alpha) = \alpha\,\psi_1/2 = 1$. Then, by using
Theorem 3.4, we get the statement of the corollary. □

Remark 3.1. It should be noticed that Propositions 3.2-3.3, Theorem 3.4 and Corollary 3.5
hold even in the case where $W$ is not Gaussian, provided that the mean and the
variance-covariance matrix of $U$ are the same as the ones given in (3.1).

Remark 3.2. From Corollary 3.5, it should be noticed that, if $0 < \omega < \infty$, then
$|\eta| \leq \eta^{*} < \infty$. This is an interesting finding which shows that the nonzero
correlation does not affect the condition for the risk dominance of the SR estimator relative
to the base estimator. Thus, under normality, in order to guarantee the existence of the MSE
in (2.5) and (2.7), it is sufficient to let $k \geq 3$.

Corollary 3.6. Under normality, $k \geq 3$ implies $0 < \omega < +\infty$ and
$|\eta| \leq \eta^{*} < +\infty$.

Proof. If $k \geq 3$, from the proof in JM (2004), $0 < \omega < +\infty$. Then, by using
Theorem 3.4, we get $|\eta| \leq \eta^{*} < +\infty$. □

Corollary 3.7. Suppose that Assumption ($\mathcal{A}_1$) holds. Under normality, $k \geq 3$
implies $0 < \omega(h) < +\infty$ and $|\eta(h)| < +\infty$.

The proof follows from Proposition 3.1 and Corollary 3.6.

Note that Corollary 3.6 generalizes the main theorem in JM (2004). Further, Corollary 3.7
extends the result of Corollary 3.6 to a class of SR-type estimators which includes
the SR-type estimator in JM (2004) as a special case.
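The role of the threshold $k \geq 3$ can be illustrated in the simplest setting. As a sketch, under the assumption (made here only for illustration) that $Z$ is a standard Gaussian vector in dimension $k$, the quadratic form $Z'Z$ is a central chi-square variable with $k$ degrees of freedom, and its inverse moment can be computed directly from the chi-square density:

```latex
\begin{align*}
\mathrm{E}\left[\frac{1}{\chi^{2}_{k}}\right]
  &= \int_{0}^{\infty} \frac{1}{x}\,
     \frac{x^{k/2-1} e^{-x/2}}{2^{k/2}\,\Gamma(k/2)}\,dx
   = \frac{2^{k/2-1}\,\Gamma(k/2-1)}{2^{k/2}\,\Gamma(k/2)}
   = \frac{1}{k-2}, \qquad k \geq 3, \\
\mathrm{E}\left[\frac{1}{\chi^{2}_{k}}\right]
  &= +\infty, \qquad k \in \{1, 2\},
  \quad \text{since } x^{k/2-2} \text{ is not integrable near } 0 \text{ when } k/2 - 2 \leq -1 .
\end{align*}
```

For a nonzero mean, $Z'Z$ is a noncentral chi-square variable, which is a Poisson mixture of central chi-square variables with $k + 2j$ degrees of freedom, $j = 0, 1, \dots$, so its inverse moment is bounded by $1/(k-2)$ as well. Together with the eigenvalue bounds on $Z'RZ$, this is why $k \geq 3$ suffices for $\omega$ to be finite.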

3.1 Extension to elliptically contoured random samples 

In this subsection, we show that the result given in Corollary 3.6 remains valid in the
context of some elliptically contoured random samples. The importance of such a family
of distributions is the primary source of our motivation. Indeed, as discussed in the
literature, elliptically contoured distributions have been particularly useful in several
areas of applications such as actuarial science (see Furman and Landsman, 2006, Landsman and
Valdez, 2003), or economics and finance (see Bingham and Kiesel, 2001).

Recall that the class of elliptically contoured distributions includes, for example, the
multivariate Gaussian, t, Pearson type II and VII, as well as Kotz distributions. To simplify
the notation, let $X \sim \mathcal{E}_q(\mu, \Sigma; g)$ stand for a $q$-column random vector
distributed as an elliptically contoured vector with mean $\mu$ and scale parameter matrix
$\Sigma$, where $\Sigma$ is a positive definite matrix, and $g$ is the probability density
function (p.d.f) generator. For the sake of simplicity, we consider the case where the p.d.f
of $X \sim \mathcal{E}_q(\mu, \Sigma; g)$ is assumed to be written as

$$
f_X(x) = \int_{0}^{\infty} \kappa(t)\, f_{\mathcal{N}(\mu,\, t^{-1}\Sigma)}(x)\, dt,
\tag{3.10}
$$

where $f_{\mathcal{N}(\mu,\, t^{-1}\Sigma)}$ denotes the p.d.f of a random vector which follows
a normal distribution with mean $\mu$ and variance-covariance $t^{-1}\Sigma$, and $\kappa(.)$
is a weighting function that satisfies $\int_{0}^{\infty} |\kappa(t)|\,dt < \infty$. Note that
the weighting function $\kappa(.)$ does not need to be nonnegative.

In the case where the function $\kappa(.)$ is nonnegative, $\kappa(.)$ is a p.d.f, and the
subclass of elliptically contoured distributions is known as a mixture of multivariate normal
distributions. For more details, we refer to Chmielewski (1981), Gupta and Varga (1995),
Nkurunziza and Chen (2013) among others. In particular, Gupta and Varga (1995) give
the conditions on the p.d.f generator $g$ for the p.d.f of
$X \sim \mathcal{E}_q(\mu, \Sigma; g)$ to be rewritten as in (3.10). From now on, we suppose
that $\left((\hat\beta - \beta)', (\tilde\beta - \beta)'\right)'$ has a p.d.f which can be
rewritten as in (3.10), and in a similar way to Section 2, let

$$
U = \begin{pmatrix} U_1 \\ U_2 \end{pmatrix}
= \begin{pmatrix} \hat\beta - \beta \\ \tilde\beta - \beta \end{pmatrix}
\sim \mathcal{E}_{2k}\left(
\begin{pmatrix} 0 \\ \gamma \end{pmatrix},
\begin{pmatrix} \Lambda & \Sigma \\ \Sigma' & \Phi \end{pmatrix}; g
\right),
\tag{3.11}
$$

where $\gamma$, $\Lambda$, $\Sigma$, $\Phi$ are as defined in Section 2.


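Theorem 3.8. Suppose that $\left((\hat\beta - \beta)', (\tilde\beta - \beta)'\right)'$ is
distributed as in (3.11) and suppose that the weighting function $\kappa(.)$ satisfies
$0 < \int_{0}^{\infty} t\,|\kappa(t)|\,dt < \infty$. Then, $k \geq 3$ implies
$0 < \omega < +\infty$ and $|\eta| \leq \eta^{*} < +\infty$.

Proof. First, recall that the family of elliptically contoured distributions is closed under
linear transformations. Then, if $\left((\hat\beta - \beta)', (\tilde\beta - \beta)'\right)'$
is distributed as in (3.11), (2.5) and (2.7) hold, and then (3.4) holds with
$Z \sim \mathcal{E}_k(\mu, I_k; g)$. Therefore, by using Remark 3.1, we conclude that
Corollary 3.5 holds. Further, as in JM (2004), we get

$$
\frac{\mathrm{E}\left(1/Z'Z\right)}{\max\left(\lambda_1(R), \lambda_2(R), \dots, \lambda_k(R)\right)}
\leq \omega \leq
\frac{\mathrm{E}\left(1/Z'Z\right)}{\min\left(\lambda_1(R), \lambda_2(R), \dots, \lambda_k(R)\right)}.
$$

Then, it suffices to prove that $\mathrm{E}\left(1/Z'Z\right) < \infty$ for all $k \geq 3$.

From (3.10) and Fubini's Theorem, we have

$$
\mathrm{E}\left(1/Z'Z\right) = \int_{0}^{\infty} \kappa(t)\,\mathrm{E}_{t}\left(1/U_0'U_0\right) dt,
$$

where $U_0 \sim \mathcal{N}_k\left(\mu, t^{-1} I_k\right)$. Note that

$$
\mathrm{E}_{t}\left[\frac{1}{U_0'U_0}\right]
= t\,\mathrm{E}_{t}\left[\frac{1}{(\sqrt{t}\,U_0)'(\sqrt{t}\,U_0)}\right]
= t\,\mathrm{E}\left[\chi_{k}^{-2}(t\,\mu'\mu)\right],
$$

and then,

$$
0 < \mathrm{E}\left(1/Z'Z\right)
= \int_{0}^{\infty} t\,\kappa(t)\,\mathrm{E}\left(\chi_{k}^{-2}(t\,\mu'\mu)\right) dt
\leq \int_{0}^{\infty} t\,|\kappa(t)|\,dt < +\infty,
$$

for all $k \geq 3$. This completes the proof. □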
Corollary 3.9. Suppose that Assumption ($\mathcal{A}_1$) holds. Also, suppose that
$\left((\hat\beta - \beta)', (\tilde\beta - \beta)'\right)'$ is distributed as in (3.11) and
suppose that the weighting function $\kappa(.)$ satisfies
$0 < \int_{0}^{\infty} t\,|\kappa(t)|\,dt < \infty$. Then, $k \geq 3$ implies
$0 < \omega(h) < +\infty$ and $|\eta(h)| < +\infty$.

The proof follows by combining Proposition 3.1 and Theorem 3.8.

Remark 3.3. Note that in the Gaussian case, the weighting function $\kappa(t)$ is the Dirac
delta function at $t = 1$ (see Gupta and Varga, 1995). Thus, the conditions of Theorem 3.8
hold since $\int_{0}^{\infty} t\,|\kappa(t)|\,dt = 1$. This shows that Corollary 3.6 and
Corollary 3.7 are special cases of Theorem 3.8 and Corollary 3.9 respectively.

3.2 Further extensions and statistical practice 

3.2.1 Singular distributions case 

In the previous sections, we derived the results under the assumption that the joint
distribution of $\hat\beta$ and $\tilde\beta$ is not singular (see the relation (2.1)). This
is a limitation which excludes, for example, the case where the imprecise prior information is
in the form of a linear restriction between the parameters. Nevertheless, this is particularly
the case where there is a restriction binding some regression coefficients. Indeed, such a
situation is common in economic theory where, for example, as introduced by Douglas and Cobb
(1928), the sum of the exponents in a Cobb-Douglas production function is known to be one.
Thus, in this subsection, we consider that
$\left((\hat\beta - \beta)', (\tilde\beta - \beta)'\right)'$ has the same distribution as in
(2.1), where the matrices $\Xi$ and $\Phi$ are (possibly) singular. For this kind of problem,
the joint distribution of $\hat\beta$ and $\tilde\beta$ is (possibly) singular and thus, it is
important to show how the proposed methodology works in this case. To this end, let $q$ be the
rank of $\Xi$ with $q \leq k$. Briefly, we show that, under some conditions, the established
results hold by replacing $k$ by $q$. Namely, a sufficient condition for the risk dominance of
any member of the class of SR-type estimators relative to the base estimator is to let
$q \geq 3$. Of course, this condition implies that $k \geq 3$ since $k \geq q$. Namely, we
suppose that the following conditions hold.

Assumption ($\mathcal{A}_2$): The function $h(\hat\beta, \tilde\beta)$ is a measurable
function of $\hat\beta - \tilde\beta$ only.


Remark 3.4. Note that the function $h$ which gives the SR estimator satisfies
Assumption ($\mathcal{A}_2$). Namely, for the SR estimator, we have
$h(\hat\beta, \tilde\beta) = 1/\|\hat\beta - \tilde\beta\|^{2}$.

Assumption ($\mathcal{A}_3$): There exists a symmetric and positive definite matrix $A$ such
that $A^{1/2}\,\Xi\,A^{1/2}$ is idempotent and $A\,\Xi\,A\,\gamma = A\,\gamma$.

Remark 3.5. It should be noted that in the case where $\Xi$ is invertible, it suffices to take
$A = \Xi^{-1}$. Below we give another, more specific, example of a matrix $A$ in the case where
the prior information is a linear restriction on the regression coefficients.

Theorem 3.10. Suppose that Assumptions ($\mathcal{A}_1$)-($\mathcal{A}_3$) hold. Under
normality, $k \geq q \geq 3$ implies $0 < \omega(h) < +\infty$ and $|\eta(h)| < +\infty$.

The proof of this theorem is given in the Appendix.


3.2.2 Special singular case: Linear restriction 

In this subsection, we show that the proposed methodology works in a very special case
where the uncertain prior information refers to a certain linear restriction. In particular, we
consider the case where the restriction is of the form

$$
R\beta = r,
\tag{3.12}
$$

where $R$ is a known $q \times k$ full-rank matrix with $q \leq k$, and $r$ is a known
$q$-column vector. With a suitable choice of the matrix $R$ and the vector $r$, the constraint
(3.12) yields the case where some regression coefficients are not statistically significant,
i.e. their corresponding explanatory variables should be excluded from the model.

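Under the constraint in (3.12), the restricted estimator for $\beta$ is
$\tilde\beta = \hat\beta - J\left(R\hat\beta - r\right)$, where
$J = (X'X)^{-1}R'\left(R(X'X)^{-1}R'\right)^{-1}$. Then, if the restriction in (3.12) does not
hold and if the error is normally distributed, it can also be verified that

$$
\begin{pmatrix} \hat\beta - \beta \\ \tilde\beta - \beta \end{pmatrix}
\sim \mathcal{N}_{2k}\left(
\begin{pmatrix} 0 \\ \gamma \end{pmatrix},
\begin{pmatrix} \Lambda & \Lambda - JR\Lambda \\ \Lambda - JR\Lambda & \Lambda - JR\Lambda \end{pmatrix}
\right),
\tag{3.13}
$$

where $\gamma = -J(R\beta - r)$ and $\Lambda = \sigma^{2}(X'X)^{-1}$. Thus, here, the
variance-covariance matrix of $(\hat\beta', \tilde\beta')'$ is singular and so is the
variance-covariance matrix of $\tilde\beta$. The following proposition shows that
Assumption ($\mathcal{A}_3$) holds by taking $A = \Lambda^{-1}$. Thus, the proposed
methodology works in this practical case.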


Proposition 3.11. Suppose that the base and restricted estimators follow the distribution
in (3.13) and let $A = \Lambda^{-1}$, then, Assumption ($\mathcal{A}_3$) holds.

Proof. We have $\Xi = JR\Lambda$ and $\gamma = -J(R\beta - r)$. Then, the proof follows after
applying standard algebraic computations. □
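To make the linear-restriction case concrete, the snippet below is a minimal sketch of the textbook restricted least squares computation, written as base estimate minus $J(R\hat\beta - r)$ with $J = (X'X)^{-1}R'(R(X'X)^{-1}R')^{-1}$. The sign convention for $J$, the function name `restricted_ls`, the simulated design, and the particular restriction are all assumptions made here for illustration; they are not taken from the paper.

```python
import numpy as np


def restricted_ls(X, y, R, r):
    """Return the LS estimate, the LS estimate under R @ beta = r, and the matrix J."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y                      # base (unrestricted) LS estimate
    # J = (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1}
    J = XtX_inv @ R.T @ np.linalg.inv(R @ XtX_inv @ R.T)
    beta_tilde = beta_hat - J @ (R @ beta_hat - r)    # restricted estimate
    return beta_hat, beta_tilde, J


# Hypothetical data: n = 25 observations, k = 3 coefficients.
rng = np.random.default_rng(0)
n, k = 25, 3
X = np.column_stack([np.ones(n), rng.normal(1.0, 1.0, size=(n, k - 1))])
beta = np.array([1.0, 0.5, -0.5])
y = X @ beta + rng.normal(0.0, 1.0, size=n)

# Hypothetical single restriction (q = 1): beta_2 + beta_3 = 0.
R = np.array([[0.0, 1.0, 1.0]])
r = np.array([0.0])

beta_hat, beta_tilde, J = restricted_ls(X, y, R, r)

# Covariance of (beta_hat - beta_tilde) up to the factor sigma^2: it has rank q, not k.
Xi_unscaled = J @ R @ np.linalg.inv(X.T @ X)
```

The restricted estimate satisfies the restriction exactly, and the rank deficiency of `Xi_unscaled` is the singularity that motivates replacing $k$ by $q$ in the dominance condition.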

Remark 3.6. Actually, an even more general result could be proved. Indeed, by using a
transformation similar to that in Nkurunziza (2013), one can extend Theorem 3.10 to the case
of a singular elliptically contoured distribution.

4 A simulation study and data analysis 

4.1 A Simulation Study 

In this section, we carry out Monte Carlo simulation studies to examine the mean square
error (MSE) performance of the SPSL over the base estimator. To this end, we follow
sampling experiments similar to those in JM (2004). Namely, for $k = 3$ and $k = 4$, we
consider the general linear model

$$
Y_i = X[i,.]\,\beta + \varepsilon_i = \sum_{j=1}^{k} \beta_j X[i,j] + \varepsilon_i,
\quad \text{for } i = 1, 2, \dots, n,
$$

for small and large sample sizes. In order to save space, we report only the results for
$n = 15$ and $n = 25$. Although not reported here, similar results hold for $n = 50$ and
$n = 125$ (they are available from the author upon request). For the dimension of the
parameter vector $\beta$, here, we focus only on the cases where $k = 3$ and $k = 4$, as the
case $k = 5$ has been studied in JM (2004). The $n \times k$ matrix $X$ and the noise
$\varepsilon$ were generated by following the sampling design described in JM (2004). For the
convenience of the reader, we outline this sampling design below.

Briefly, as in the quoted paper, for k = 3 and k = 4, the first column of the n × k matrix X is a column of unit values and the remaining columns of the X[i, ·]'s are generated independently from a (k − 1)-dimensional normal distribution with a mean vector of 1s, standard deviations all equal to 1, and various levels of pairwise correlations. Further, the observations of the $\varepsilon_i$'s were generated independently based on various normal probability distributions, all defined to have zero means over a range of standard deviations. For every sample size, 5,000 replications were carried out in order to compute the empirical quadratic risk estimates.
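
The sampling design just described can be sketched as follows. This is a minimal illustration and not the code used for the reported results: the function name `simulate_design`, the use of a single common pairwise correlation `rho` per experiment, and the particular values in the usage lines are assumptions.

```python
import numpy as np

def simulate_design(n, k, rho, sigma, beta, rng):
    """Draw one sample (X, y) from the linear model with an intercept column.

    Assumption: the k-1 non-constant regressors share one pairwise
    correlation rho, with unit variances and means equal to 1.
    """
    cov = np.full((k - 1, k - 1), rho) + (1.0 - rho) * np.eye(k - 1)
    Z = rng.multivariate_normal(mean=np.ones(k - 1), cov=cov, size=n)
    X = np.column_stack([np.ones(n), Z])          # first column of unit values
    eps = rng.normal(loc=0.0, scale=sigma, size=n)  # zero-mean normal errors
    return X, X @ beta + eps

rng = np.random.default_rng(1)
beta = np.array([1.0, 0.5, -0.5])                 # hypothetical coefficient vector, k = 3
X, y = simulate_design(n=15, k=3, rho=0.5, sigma=0.25, beta=beta, rng=rng)
print(X.shape, y.shape)
```

Repeating the draw 5,000 times for each configuration and averaging the squared-error losses would give empirical risk estimates of the kind described in the text.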

As in JM (2004), we take $\tilde\beta = [\operatorname{diag}(X'X)]^{-1} X'y$, where $\operatorname{diag}(X'X)$ denotes the k × k diagonal matrix formed from the diagonal of $X'X$. Further, as in JM (2004), the comparison between the SPSL and LS estimators is based on the relative mean square efficiency (RMSE) of the estimators with respect to LS, namely

RMSE(proposed estimator) = risk(proposed estimator) / risk(LS).

Therefore, we have

RMSE(LS) = risk(LS) / risk(LS) = 1,  RMSE(SPSL) = risk(SPSL) / risk(LS).

Thus, a relative efficiency less than one indicates the degree of superiority of the new estimator over the LS estimator.
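
As a small illustration of the ratio defined above, the following sketch computes an empirical relative efficiency from a set of replicated estimates under quadratic loss. The function names and the toy numbers are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def empirical_risk(estimates, beta):
    """Average quadratic loss ||b - beta||^2 over replications (one per row)."""
    diffs = np.asarray(estimates, dtype=float) - np.asarray(beta, dtype=float)
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

def relative_efficiency(estimates, ls_estimates, beta):
    """risk(proposed estimator) / risk(LS); values below 1 favour the proposal."""
    return empirical_risk(estimates, beta) / empirical_risk(ls_estimates, beta)

beta = [1.0, 2.0]                           # toy "true" parameter
ls = [[1.5, 2.0], [0.5, 2.0]]               # two toy LS replications
proposed = [[1.1, 2.0], [0.9, 2.0]]         # two toy replications of a competitor
print(relative_efficiency(ls, ls, beta))    # LS against itself
print(relative_efficiency(proposed, ls, beta))
```

In this toy case the competitor's squared errors are 0.01 against 0.25 for LS, so the second ratio is close to 0.04.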

More precisely, in this paper, by following the results in Subsection 5.2 of JM (2004), we examine the relative performance of the SPSL estimator as a function of the parameter norms $\beta'\beta$ and $\gamma'\gamma$, where the parameter vector $\beta$ is chosen such that $\beta'\beta \in \{1.2, 4.8, 10.7, 19.0, 29.7\}$, which are the values used in JM (2004, Figure 3).

For the small sample sizes and k = 3, the results are presented in Figures 1 and 2. These figures show that, as the norms $\beta'\beta$ or $\gamma'\gamma$ increase, the risk of the SPSL estimator increases and approaches the risk of the LS estimator. Further, Figures 3 and 4 show a similar pattern for the case where k = 4 and/or for the cases of moderate and large sample sizes. This result is in agreement with that given in JM (2004) for the cases where k ≥ 5. Also, these figures confirm the findings in JM (2004) in that the correlation among the X variables increases the relative performance of the SPSL estimator over the LS estimator.

Figure 1: Relative efficiency versus $\beta'\beta$, k = 3. Panels (a)-(d): n = 15 with σ = 0.1, 0.25, 0.5, 1; panels (e)-(h): n = 25 with σ = 0.1, 0.25, 0.5, 1.

Figure 2: Relative efficiency versus $\gamma'\gamma$, k = 3. Panels (a)-(d): n = 15 with σ = 0.1, 0.25, 0.5, 1; panels (e)-(h): n = 25 with σ = 0.1, 0.25, 0.5, 1.

Figure 3: Relative efficiency versus $\beta'\beta$, k = 4. Panels (a)-(d): n = 15 with σ = 0.1, 0.25, 0.5, 1; panels (e)-(h): n = 25 with σ = 0.1, 0.25, 0.5, 1.

Figure 4: Relative efficiency versus $\gamma'\gamma$, k = 4. Panels (a)-(d): n = 15 with σ = 0.1, 0.25, 0.5, 1; panels (e)-(h): n = 25 with σ = 0.1, 0.25, 0.5, 1.

4.2 Data analysis 

In this subsection, we illustrate the application of the proposed method to a real data set. The data set consists of a sample of 25 brands of cigarettes (see Mendenhall and Sincich, 1992). For each brand of cigarette, the measurements of weight as well as tar, nicotine, and carbon monoxide content have been recorded.

The choice of this data set is justified by several health and environmental issues concerning cigarettes, as mentioned by some medical studies. As explained in Mendenhall and Sincich (1992), "the United States Surgeon General considers each of these substances hazardous to a smoker's health". The authors also mention that "past studies have shown that increases in the tar and nicotine content of a cigarette are accompanied by an increase in the carbon monoxide emitted from the cigarette smoke".

Accordingly, in order to illustrate the application of the proposed method, the response variable is taken as the carbon monoxide content, while the three covariates are: $X_1$: weight, $X_2$: tar content, and $X_3$: nicotine content. So, including the intercept, we apply the proposed method to the regression model for which n = 25 and k = 4. It should be noticed that, for such a data set with k < 5, the result in JM (2004) cannot be used to justify the efficiency of the SPSL over the base estimator. In contrast, the result established in this paper justifies the relative efficiency of the SPSL estimator provided that the underlying distribution of the error terms is an elliptically contoured distribution. To give some numerical descriptive measures, the sample mean is 12.5280 for the response variable, and for the covariates, the sample means are 12.2160, 0.8764 and 0.9703 for the weight, tar and nicotine content, respectively. The correlation coefficients between the response and the covariates are shown in Table 1. This table indicates that the weight and the tar content are highly correlated with the response, while the correlation between the nicotine content and the response is modest. Nevertheless, the correlation coefficients are statistically significant at the 5% level. Further, the covariates seem pairwise correlated at significance level 5%.

Table 1: Correlation between covariates and response variables (p-value in parentheses)

                   Carbon monoxide   Weight     Tar content   Nicotine content
Carbon monoxide    1                 0.9575     0.9259        0.4640
                   (-)               (0.0000)   (0.0000)      (0.0195)
Weight             0.9575            1          0.9766        0.4908
                   (0.0000)          (-)        (0.0000)      (0.0127)
Tar content        0.9259            0.9766     1             0.5002
                   (0.0000)          (0.0000)   (-)           (0.0109)
Nicotine content   0.4640            0.4908     0.5002        1
                   (0.0195)          (0.0127)   (0.0109)      (-)

By applying the method, we obtain the point estimates based on the LS and SPSL estimators as reported in Table 2. To assess the performance of the estimators, we compute the mean squared error based on a bootstrap method with 5000 replications. The relative efficiency of the estimators is given in Table 3.
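
A bootstrap comparison of this kind can be sketched as follows. The text does not spell out the resampling scheme, so everything here is an assumption: rows are resampled in pairs, the full-sample LS estimate stands in for the unknown coefficient vector, and the competitor is the estimator $[\operatorname{diag}(X'X)]^{-1}X'y$ from Section 4.1 rather than the SPSL estimator itself. The data are synthetic, not the cigarette measurements.

```python
import numpy as np

def ls_estimator(X, y):
    """Ordinary least squares via a least-squares solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def diag_estimator(X, y):
    """[diag(X'X)]^{-1} X'y: each coefficient uses only its own column."""
    return (X.T @ y) / np.sum(X ** 2, axis=0)

def bootstrap_relative_efficiency(X, y, estimator, n_boot=5000, seed=0):
    """Bootstrap ratio of quadratic losses, estimator versus LS.

    Assumptions: pairs bootstrap; the full-sample LS estimate is used as the
    reference point in place of the unknown true coefficients.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    reference = ls_estimator(X, y)
    loss_est = 0.0
    loss_ls = 0.0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample rows with replacement
        Xb, yb = X[idx], y[idx]
        loss_est += np.sum((estimator(Xb, yb) - reference) ** 2)
        loss_ls += np.sum((ls_estimator(Xb, yb) - reference) ** 2)
    return loss_est / loss_ls

# Synthetic stand-in data with n = 25 and k = 4 (intercept plus three covariates).
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(25), rng.normal(size=(25, 3))])
y = X @ np.array([3.0, 1.0, -2.0, 0.5]) + rng.normal(scale=1.0, size=25)
print(bootstrap_relative_efficiency(X, y, diag_estimator, n_boot=200))
```

A ratio below 1 would indicate a smaller bootstrap mean squared error than LS for the chosen competitor; passing a different callable as `estimator` compares another estimator on the same resamples.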


Table 2: Point estimates

Parameter    LS         SPSL
Intercept    3.2022     3.9325
Weight       0.9626     0.9645
Tar          -2.6317    -1.3262
Nicotine     -0.1305    0.8983

Table 3: Relative efficiency (bootstrap)

Estimator             LS    SPSL
Relative efficiency   1     0.7578

From Table 3, one can clearly see that the relative efficiency of the SPSL estimator is less than 1, which is the relative efficiency of the base estimator. This illustrates that the SPSL dominates the base estimator.


5 Conclusion 

To conclude, let us first recall that the main result in JM (2004) gives a sufficient condition for the Stein rule (SR)-type estimator to dominate the base estimator. In this paper, we provided more refined inequalities and bounds, which are used in establishing this result in its full generality. Namely, we generalized the result in JM (2004) in four ways. To this end, we provided an alternative and versatile approach for establishing the dominance result. In particular, we proved theoretically that the nonzero correlation does not change the condition for the risk dominance of the SR estimator. The impact of this result is that, unlike the method in JM (2004), our method is also applicable to a wide range of regression models, including, for instance, quadratic or cubic regressions, where the number of regressors is less than 5. In addition, we relaxed the condition of normality of the sampling distribution of the base estimator. We also generalized the method in JM (2004) to the case where the variance-covariance matrix of the base estimator and the restricted estimator may be singular. The significance of this finding is that our method is also efficient in the case where past statistical investigations may have established that some regression coefficients are not statistically significant while a field specialist believes that the nonsignificant explanatory variables are important. From the practical point of view, we evaluated numerically the relative efficiency of a data-based semiparametric Stein-like (SPSL) estimator. The simulation studies corroborate the theoretical finding that the sufficient condition for the SPSL to dominate the LS estimator (k ≥ 3) holds regardless of the correlation factor. Nevertheless, Figures 1-8 show that the correlation may amplify the risk dominance of the SPSL. Finally, the proposed method is applied to the Cigarette dataset, produced by the USA Federal Trade Commission, for which k = 4. An interesting result is that, by using a bootstrap method, we see that the SPSL dominates the base estimator. This finding is in agreement with the theoretical result proved in the present paper.


A Appendix 


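Proposition A.1. Let $C$ be an $m \times m$ symmetric matrix. Then,

$$
|x'Cx| \le \max\{|\lambda_1(C)|, |\lambda_2(C)|, \dots, |\lambda_m(C)|\}\, x'x, \quad \text{for all } m\text{-column vectors } x.
$$

Proof. Since the matrix $C$ is symmetric, there exists an orthogonal matrix $Q$ such that $Q'CQ = D = \operatorname{diag}(\lambda_1(C), \lambda_2(C), \dots, \lambda_m(C))$. Therefore,

$$
x'Cx = x'QQ'CQQ'x = y'Dy = \sum_{i=1}^{m} \lambda_i(C)\, y_i^2, \quad \text{where } y = Q'x = (y_1, y_2, \dots, y_m)'.
$$

Therefore,

$$
|x'Cx| = \Big|\sum_{i=1}^{m} \lambda_i(C)\, y_i^2\Big| \le \sum_{i=1}^{m} |\lambda_i(C)|\, y_i^2 \le \max\{|\lambda_1(C)|, |\lambda_2(C)|, \dots, |\lambda_m(C)|\} \sum_{i=1}^{m} y_i^2
$$
$$
= \max\{|\lambda_1(C)|, \dots, |\lambda_m(C)|\}\, y'y = \max\{|\lambda_1(C)|, \dots, |\lambda_m(C)|\}\, x'x,
$$

this completes the proof. □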
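
The two steps of this proof, the spectral rewriting of the quadratic form and the resulting eigenvalue bound, can be traced numerically. The sketch below uses a randomly generated symmetric matrix and vector; the dimension and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
B = rng.normal(size=(m, m))
C = (B + B.T) / 2.0                      # a symmetric m x m matrix
lam, Q = np.linalg.eigh(C)               # C = Q diag(lam) Q', Q orthogonal

x = rng.normal(size=m)
y = Q.T @ x                              # rotated coordinates, so y'y = x'x

quad_direct = x @ C @ x                  # x'Cx computed directly
quad_spectral = np.sum(lam * y ** 2)     # sum_i lambda_i(C) y_i^2
bound = np.max(np.abs(lam)) * (x @ x)    # max_i |lambda_i(C)| x'x

print(np.isclose(quad_direct, quad_spectral), abs(quad_direct) <= bound)
```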


Corollary A.2. Let $C$ be an $m \times m$ matrix. Let $x$ and $y$ be $m$-column vectors. Then, we have

$$
|x'Cx| \le \tfrac{1}{2}\max\{|\lambda_1(C + C')|, |\lambda_2(C + C')|, \dots, |\lambda_m(C + C')|\}\, x'x,
$$

and

$$
|y'Cx| \le \tfrac{1}{2}\max\{|\lambda_1(B_0)|, |\lambda_2(B_0)|, \dots, |\lambda_{2m}(B_0)|\}\,(x'x + y'y), \quad \text{where } B_0 = \begin{pmatrix} 0 & C' \\ C & 0 \end{pmatrix}.
$$

Proof. Since $x'Cx$ is a real number, we have $x'Cx = \tfrac{1}{2}x'(C + C')x$. Therefore, since $C + C'$ is a symmetric matrix, the first statement follows directly from Proposition A.1. To prove the second statement, note that $y'Cx$ can be rewritten as

$$
y'Cx = (x', y') \begin{pmatrix} 0 & 0 \\ C & 0 \end{pmatrix} (x', y')'.
$$

The rest of the proof follows from the first statement. □
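
Both bounds of the corollary can be checked on random inputs. The sketch below is only a numerical sanity check under arbitrary choices of dimension and seed; the helper names `quadratic_bound` and `bilinear_bound` are introduced here for illustration.

```python
import numpy as np

def quadratic_bound(C, x):
    """Right-hand side of the first bound: (1/2) max|lambda_i(C + C')| x'x."""
    lam = np.max(np.abs(np.linalg.eigvalsh(C + C.T)))
    return 0.5 * lam * (x @ x)

def bilinear_bound(C, x, y):
    """Right-hand side of the second bound, using B0 = [[0, C'], [C, 0]]."""
    m = C.shape[0]
    zero = np.zeros((m, m))
    B0 = np.block([[zero, C.T], [C, zero]])
    lam = np.max(np.abs(np.linalg.eigvalsh(B0)))
    return 0.5 * lam * (x @ x + y @ y)

rng = np.random.default_rng(1)
m = 4
C = rng.normal(size=(m, m))              # arbitrary square matrix, not symmetric
all_hold = True
for _ in range(1000):
    x, y = rng.normal(size=m), rng.normal(size=m)
    all_hold &= abs(x @ C @ x) <= quadratic_bound(C, x) + 1e-12
    all_hold &= abs(y @ C @ x) <= bilinear_bound(C, x, y) + 1e-12
print(bool(all_hold))
```

The symmetric block matrix is what allows a bilinear form in two different vectors to be handled by the same eigenvalue argument as a quadratic form.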


Proof of Proposition 3.1. Under Assumption ($\mathcal{A}_1$), there exists $q_0 > 0$ such that
$|h(\hat\beta, \tilde\beta)| \le q_0 / \|\hat\beta - \tilde\beta\|^2$. Then, we have

|»|(A)|SE(|A(i3,i3)||^-P|)<«oE(|j3-p|/||j§-P|p), 

and then, by using (2.1) and (3.1), we get

e(|j8-/3|/||j8-j8||2) =e{\U{PZ\ /z^RZ^ 

this proves the first statement of the proposition. Further, we have 

and then, by using (2.1) and (3.1), we get

$$
E\big(1/\|\hat\beta - \tilde\beta\|^2\big) = E\big(1/(Z'RZ)\big) = \varpi,
$$

this completes the proof. □


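Proof of Proposition 3.2. Let $W = (Z', U_1')'$. We have

$$
E\big\{(|U_1'PZ| / (Z'RZ))\, I\{\|W\|^2 \le a\}\big\} = E\big\{(|W'FW| / (Z'RZ))\, I\{\|W\|^2 \le a\}\big\}, \qquad (A.1)
$$

where $F$ is defined in (3.3). By using Corollary A.2, we get

$$
\frac{|W'FW|}{Z'RZ} \le \frac{\psi_1}{2}\, \frac{\|W\|^2}{Z'RZ},
$$

which gives

$$
E\big\{(|W'FW| / (Z'RZ))\, I\{\|W\|^2 \le a\}\big\} \le a\psi_1 E\{1/(Z'RZ)\}/2 \le a\psi_1 \varpi / 2, \qquad (A.2)
$$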


and the proof is completed. □

Proof of Proposition 3.3. We have

$$
\frac{|W'FW|}{Z'RZ} = \frac{|W'FW|}{\|Z\|^2}\, \frac{\|Z\|^2}{Z'RZ}.
$$

By Courant's Theorem, we have the inequality

$$
1/\max\{\lambda_1(R), \lambda_2(R), \dots, \lambda_k(R)\} \le \|Z\|^2 / (Z'RZ) \le 1/\psi_0,
$$

which gives

$$
E\big[(|W'FW| / (Z'RZ))\, I\{\|W\|^2 > a\}\big] \le 2E\{|W'FW|\} / (a\psi_0). \qquad (A.3)
$$

Further, by using Corollary A.2, we get

$$
E\{|W'FW|\} \le \max\{|\lambda_1(F + F')|, |\lambda_2(F + F')|, \dots, |\lambda_{2k}(F + F')|\}\, E(W'W)/2. \qquad (A.4)
$$

Note that $F + F' = B$ and

$$
W \sim \mathcal{N}_{2k}\left( \begin{pmatrix} \mu \\ 0 \end{pmatrix},\ \begin{pmatrix} I_k & (\Lambda - \Sigma)(P^{-1})' \\ P^{-1}(\Lambda - \Sigma') & \Lambda \end{pmatrix} \right).
$$

Then, $E(W'W) = \operatorname{trace}(\Lambda) + k + \mu'\mu$, and then,

$$
E\{|W'FW|\} \le \psi_1 \left[\operatorname{trace}(\Lambda) + k + \mu'\mu\right]/2. \qquad (A.5)
$$

By combining (A.3), (A.4) and (A.5), we get the statement of the proposition, which completes the proof. □


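Proposition A.3. Suppose that Assumptions ($\mathcal{A}_1$) and ($\mathcal{A}_2$) hold. Then, under normality, $\omega(h) < \infty$ provided that $q \ge 3$.

Proof. We have $\omega(h) = E\big[h^2(\hat\beta, \tilde\beta)\, \|\hat\beta - \tilde\beta\|^2\big]$, and then, under Assumption ($\mathcal{A}_1$), there exists $q_0$ such that

$$
\omega(h) \le q_0^2\, E\big[1/\|\hat\beta - \tilde\beta\|^2\big] \le q_0^2 \operatorname{trace}(AHA)\, E\big[1/\big((\hat\beta - \tilde\beta)'AHA(\hat\beta - \tilde\beta)\big)\big].
$$

Further, by using Theorem 5.1.3 in Mathai and Provost (1992, p. 199), we have $(\hat\beta - \tilde\beta)'AHA(\hat\beta - \tilde\beta) \sim \chi^2_q(\gamma'AHA\gamma)$ and then,

$$
E\big[1/\big((\hat\beta - \tilde\beta)'AHA(\hat\beta - \tilde\beta)\big)\big] = E\big[\chi_q^{-2}(\gamma'AHA\gamma)\big] \le 1/(q - 2).
$$

Hence, $\omega(h) \le q_0^2 \operatorname{trace}(AHA)/(q - 2) < +\infty$ provided that $q \ge 3$, this completes the proof. □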
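
The moment bound used in the last step, namely that the expected inverse of a noncentral chi-square variable with $q \ge 3$ degrees of freedom does not exceed $1/(q - 2)$, can be illustrated by simulation. The sketch below is a Monte Carlo check and not a proof; the function name, the degrees of freedom, the noncentrality value and the number of draws are arbitrary choices.

```python
import numpy as np

def inverse_chi2_mean(q, noncentrality, n_draws=200_000, seed=0):
    """Monte Carlo estimate of E[1 / chi^2_q(noncentrality)]."""
    rng = np.random.default_rng(seed)
    draws = rng.noncentral_chisquare(df=q, nonc=noncentrality, size=n_draws)
    return float(np.mean(1.0 / draws))

q, delta = 5, 3.0
estimate = inverse_chi2_mean(q, delta)
print(estimate, 1.0 / (q - 2))           # the estimate should sit below 1/(q - 2)
```

For $q \le 2$ the inverse moment is infinite in the central case, which is why the condition $q \ge 3$ appears in the proposition.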


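Proof of Theorem 3.10. We have

$$
\eta(h) = E\big[h(\hat\beta, \tilde\beta)\,(\hat\beta - \beta)'(\hat\beta - \tilde\beta)\big],
$$

and this gives

$$
\eta(h) = E\big[h(\hat\beta, \tilde\beta)\, \|\hat\beta - \tilde\beta\|^2\big] + E\big[h(\hat\beta, \tilde\beta)\,(\tilde\beta - \beta)'(\hat\beta - \tilde\beta)\big].
$$

Then, by the triangle inequality,

$$
|\eta(h)| \le \big|E\big[h(\hat\beta, \tilde\beta)\, \|\hat\beta - \tilde\beta\|^2\big]\big| + \big|E\big[h(\hat\beta, \tilde\beta)\,(\tilde\beta - \beta)'(\hat\beta - \tilde\beta)\big]\big|.
$$

Then, by using Jensen's inequality and Assumption ($\mathcal{A}_1$), we get

$$
|\eta(h)| \le q_0 + \big|E\big[h(\hat\beta, \tilde\beta)\,(\tilde\beta - \beta)'(\hat\beta - \tilde\beta)\big]\big|.
$$

Further, since $(\hat\beta - \tilde\beta)$ and $(\tilde\beta - \beta)$ are independent, and since, by Assumption ($\mathcal{A}_1$), $h(\hat\beta, \tilde\beta)$ is a measurable function of $\hat\beta - \tilde\beta$ only, we have

$$
|\eta(h)| \le q_0 + \big|\gamma' E\big[h(\hat\beta, \tilde\beta)\,(\hat\beta - \tilde\beta)\big]\big|.
$$

Then, by the Cauchy-Schwarz inequality,

$$
|\eta(h)| \le q_0 + \|\gamma\| \Big(E\big[h^2(\hat\beta, \tilde\beta)\, \|\hat\beta - \tilde\beta\|^2\big]\Big)^{1/2} = q_0 + \|\gamma\|\, \omega^{1/2}(h).
$$

Therefore, the proof follows from Proposition A.3. □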